feat(gemma3): Add BiGemma3 and ColGemma3 models with Matryoshka embeddings #362
- Introduced BiGemma3 and BiGemmaProcessor3 for image and text processing.
- Added ColGemma3 and ColGemmaProcessor3 for late interaction retrieval.
- Implemented model and processor classes with appropriate forward methods.
- Created unit tests for BiGemma3 and ColGemma3 models and their processors.
- Ensured compatibility with existing Gemma3 architecture and added necessary processing utilities.
…mension validation and improved processor loading
- Implemented offline and online testing for BiGemma3 using Matryoshka embeddings.
- Created synthetic images and queries for testing across multiple dimensions (768, 1536, 2560).
- Validated image and query encoding, similarity scoring, and retrieval performance.
- Configured Modal app with necessary dependencies and environment settings.
- Added comprehensive logging and validation checks for test results.
- Implemented `serve_hf_snapshot.py` for HuggingFace model serving with optimized cold start and warmup.
- Introduced `serve_vllm_snapshot.py` for vLLM model serving with sleep mode and GPU memory snapshots.
- Added comprehensive benchmark report for inference performance in `INFERNECE_PERFORMANCE.md`.
- Both scripts support FastAPI endpoints for embedding generation and health checks.
- Configured deployment settings including GPU type, memory, and scaledown behavior.
…arameter and adjust forward method for validation
Inference demo
Will look at this tmrw! Thanks!
@adithya-s-k could you add some interpretability maps from your tests? Just checking how they differ from colmodernvbert and colqwen3
…bility in generate_interpretability_maps.py
@athrael-soju, I have pushed the code for the interpretability maps, do check it out
ManuelFay
left a comment
LGTM, but can you please ruff the code and fix the tests so CI passes?
For ruff it's probably not even on you. The CI was disabled last week with the shai hulud bug, so ruff checks didn't pass on one of the merged PRs.
The gemma tests fail because of the model gating. Do you have a base model (like we do for all the supported architectures) that is initialized with the final projection? Otherwise init will be random every time if we start from gemma (+ the gating problem).
Hey, I have just pushed two models
also the final checkpoints can be used
Yeah, looks good! Can you update the PR to include them so we run the CI again?
…fied - Adithya S K
Hi @ManuelFay, I have made all the requested changes. I have locally verified that the tests now run correctly with these models and everything looks good on my side.
Awesome, I'll merge my linting PR and then merge yours! Thanks a ton!
Can you rebase on main (will make the ruff CI happy)? Then we will merge!
ManuelFay
left a comment
Only 1 ruff error remaining. Then we can merge, rest looks nice!
Thanks again!
@ManuelFay I have fixed the ruff issue, I think everything should be set to merge the PR
We should document more clearly how to ruff, but the CI is still failing. You need to run `ruff format --check`. I approved and will merge right after.
@ManuelFay I have run it locally and tested, it should pass all the checks now
Thank you for the contributions! Don't hesitate to submit your model results on the MTEB visual retrieval leaderboard!
Summary
This PR adds support for BiGemma3 and ColGemma3 models based on the Gemma3-4B-IT backbone, enabling multilingual multimodal document retrieval.
Key Features
BiGemma3: Single-vector dense retrieval model with Matryoshka representation learning
ColGemma3: Multi-vector late interaction model using ColBERT-style architecture
Changes
- Added `BiGemma3` model in `colpali_engine/models/gemma3/bigemma3/`
  - `modeling_bigemma.py`: Model implementation with Matryoshka support
  - `processing_bigemma.py`: Processor for images and text
- Added `ColGemma3` model in `colpali_engine/models/gemma3/colgemma3/`
  - `modeling_colgemma.py`: Multi-vector model implementation
  - `processing_colgemma.py`: Processor with MaxSim scoring
- Added comprehensive tests in `tests/models/gemma3/`
API Design
BiGemma3 allows choosing embedding dimension at inference time:
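The original snippet for this step is not preserved here, so the following is only a minimal sketch of the general Matryoshka idea, not the BiGemma3 API. It assumes that a smaller embedding is obtained by keeping the first `dim` components of the full vector and L2-normalizing the result. The function names `truncate_embedding` and `cosine_score` are hypothetical, and plain Python lists stand in for the tensors the real model would return.

```python
import math


def truncate_embedding(embedding, dim):
    """Keep the first `dim` components and L2-normalize them.

    Hypothetical helper: illustrates Matryoshka-style truncation only.
    """
    if not 0 < dim <= len(embedding):
        raise ValueError(f"dim must be in 1..{len(embedding)}, got {dim}")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


def cosine_score(query, doc, dim):
    """Cosine similarity between two embeddings truncated to `dim`."""
    q = truncate_embedding(query, dim)
    d = truncate_embedding(doc, dim)
    return sum(a * b for a, b in zip(q, d))


# Toy 8-dimensional vectors; the PR's tests use 768, 1536 and 2560.
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.0, 0.1, 0.2]
doc = [0.8, 0.2, 0.1, 0.3, 0.0, 0.1, 0.0, 0.1]

for dim in (2, 4, 8):
    print(dim, round(cosine_score(query, doc, dim), 4))
```

Smaller dimensions trade some retrieval quality for cheaper storage and faster scoring, which is why the dimension is worth exposing at inference time.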
ColGemma3 uses standard multi-vector late interaction:
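The original snippet is likewise missing, so this is a hedged sketch of ColBERT-style MaxSim scoring in general rather than the `ColGemma3` processor's actual interface. It assumes each query and each document is a list of per-token vectors. For every query token it takes the maximum dot product over all document tokens, then sums those maxima. The names `dot` and `maxsim_score` are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best match.

    Hypothetical helper: illustrates MaxSim only, not the library API.
    """
    if not query_vecs or not doc_vecs:
        raise ValueError("query and document need at least one vector each")
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example with 2-dimensional token embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 1.0]]

scores = {name: maxsim_score(query, vecs)
          for name, vecs in (("doc_a", doc_a), ("doc_b", doc_b))}
best = max(scores, key=scores.get)
print(scores, best)
```

Unlike the single-vector case, the document keeps one vector per token or image patch, so storage grows with document length in exchange for finer-grained matching.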
Models
Related Work